Learned Sorted Table Search and Static Indexes in Small-Space Data Models

Abstract

Machine-learning techniques, properly combined with data structures, have resulted in Learned Static Indexes, innovative and powerful tools that speed up Binary Searches with the use of additional space with respect to the table being searched into. Such space is devoted to the machine-learning models. Although in their infancy, these are methodologically and practically important, due to the pervasiveness of Sorted Table Search procedures. In modern applications, model space is a key factor, and a major open question concerning this area is to assess to what extent one can enjoy the speeding up of Binary Searches achieved by Learned Indexes while using constant or nearly constant-space models. In this paper, we investigate the mentioned question by (a) introducing two new models, i.e., the Learned k-ary Search Model and the Synoptic Recursive Model Index; and (b) systematically exploring the time–space trade-offs of a hierarchy of existing models, i.e., the ones in the reference software platform Searching on Sorted Data, together with the new ones proposed here. We document a novel and rather complex time–space trade-off picture, which is informative for users as well as designers of Learned Indexing data structures. By adhering to and extending the current benchmarking methodology, we experimentally show that the Learned k-ary Search Model is competitive in time with respect to Binary Search in constant additional space. Our second model, together with the bi-criteria Piece-wise Geometric Model Index, can achieve a speed-up of Binary Search with a model space of 0.05% more than the one taken by the table, thereby being competitive in terms of the time–space trade-off with existing proposals. The Synoptic Recursive Model Index and the bi-criteria Piece-wise Geometric Model Index complement each other quite well across the various levels of the internal memory hierarchy. Finally, our findings stimulate research in this area since they highlight the need for further studies regarding the time–space relation in Learned Indexes.
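The general idea of speeding up a Sorted Table Search with a constant-space learned model can be illustrated with a minimal sketch. It is not one of the models studied in the paper; it assumes a single least-squares line that predicts a key's position, plus a stored maximum error `eps` that bounds the window in which an ordinary binary search is then run. The function names `fit_linear_model` and `learned_search` are hypothetical and chosen only for this illustration.

```python
from bisect import bisect_left


def fit_linear_model(table):
    """Fit rank ~ slope * key + intercept on a sorted table (least squares).

    Returns (slope, intercept, eps): three numbers, so the model takes
    constant space regardless of the table size.
    """
    n = len(table)
    mean_x = sum(table) / n
    mean_y = (n - 1) / 2
    var_x = sum((x - mean_x) ** 2 for x in table)
    if var_x == 0:
        slope = 0.0  # all keys equal: predict the middle rank
    else:
        cov = sum((x - mean_x) * (i - mean_y) for i, x in enumerate(table))
        slope = cov / var_x
    intercept = mean_y - slope * mean_x
    # Largest distance between a true rank and its predicted rank.
    eps = max(abs(i - round(slope * x + intercept)) for i, x in enumerate(table))
    return slope, intercept, eps


def learned_search(table, model, key):
    """Return an index of key in table, or -1 if key is absent."""
    slope, intercept, eps = model
    n = len(table)
    guess = round(slope * key + intercept)
    # Clamp the search window [guess - eps, guess + eps] to the table.
    lo = min(max(0, guess - eps), n)
    hi = max(lo, min(n, guess + eps + 1))
    i = bisect_left(table, key, lo, hi)  # binary search inside the window only
    return i if i < n and table[i] == key else -1


table = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
model = fit_linear_model(table)
positions = [learned_search(table, model, k) for k in table]
```

When the keys are close to linearly distributed, `eps` is small and the final binary search touches only a few entries; for skewed data `eps` grows and the benefit shrinks, which is one face of the time–space trade-off the abstract refers to.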

Similar resources

Filtered Document Retrieval with Frequency-Sorted Indexes

Ranking techniques are effective at finding answers in document collections but can be expensive to evaluate. We propose an evaluation technique that uses early recognition of which documents are likely to be highly ranked to reduce costs; for our test data, queries are evaluated in 2% of the memory of the standard implementation without degradation in retrieval effectiveness. CPU time and disk...

Sorted Kernel Matrices as Cluster Validity Indexes

Two basic issues for data analysis and kernel-machines design are approached in this paper: determining the number of partitions of a clustering task and the parameters of kernels. A distance metric is presented to determine the similarity between kernels and FCM proximity matrices. It is shown that this measure is maximized, as a function of kernel and FCM parameters, when there is coherence w...

Many Sorted Algebraic Data Models for GIS

Although many GIS data models are available, a declarative, operational, well-defined, implementation-independent, and object-oriented language is lacking. Based on the theory of many sorted algebra, this work presents a family of geometric data models. Some geographical data models of urban information systems are illustrated using homomorphism. According to the results, the preferred character...

Total and Partial efficiency indexes in data envelopment analysis

Introduction: Data envelopment analysis (DEA) is a data-oriented method for measuring and benchmarking the relative efficiency of peer decision making units (DMUs) with multiple inputs and multiple outputs. DEA was initiated in 1978 when Charnes, Cooper and Rhodes (CCR) demonstrated how to change a fractional linear measure of efficiency into a linear programming format. This non-parametric app...

Faster Path Indexes for Search in XML Data

This article describes how to implement efficient memory resident path indexes for semi-structured data. Two techniques are introduced, and they are shown to be significantly faster than previous methods when facing path queries using the descendant axis and wild-cards. The first is conceptually simple and combines inverted lists, selectivity estimation, hit expansion and brute force search. Th...

Journal

Journal title: Data

Year: 2023

ISSN: 2306-5729

DOI: https://doi.org/10.3390/data8030056